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Using the LASSO’s Dual for Regularization in 
Sparse Signal Reconstruction from Array Data 

Christoph F. Mecklenbrauker, Peter Gerstoft, Erich Zochmann 


Abstract 

Waves from a sparse set of source hidden in additive noise are observed by a sensor array. We 
treat the estimation of the sparse set of sources as a generalized complex-valued LASSO problem. The 
corresponding dual problem is formulated and it is shown that the dual solution is useful for selecting 
the regularization parameter of the LASSO when the number of sources is given. The solution path of 
the complex-valued LASSO is analyzed. For a given number of sources, the corresponding regularization 
parameter is determined by an order-recursive algorithm and two iterative algorithms that are based on 
a further approximation. Using this regularization parameter, the DOAs of all sources are estimated. 

Index Terms 


sparsity, generalized LASSO, duality theory 

I. Introduction 

This paper contributes to the area of sparse signal estimation for sensor array processing. Sparse signal 
estimation techniques retrieve a signal vector from an undercomplete set of noisy measurements when 
the signal vector is assumed to have only few nonzero components at unknown positions. Research in 
this area was spawned by the Least Absolute Shrinkage and Selection Operator (LASSO) 01. In the 
related field of compressed sensing, this sparse signal reconstruction problem is known as the atomic 
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decomposition problem Q. The early results for sparse signals ||3]|, iQ, Q have been extended to 
compressible (approximately sparse) signals and sparse signals buried in noise 0, Q, ||8l, Q, lITOll 
which renders the framework applicable to problems in array processing. 

Similar to ifTTI . lIT^ . the LASSO is generalized and formulated for complex-valued observations 
acquired from a sensor array. It is shown here that the corresponding dual vector is interpretable as 
the output of a weighted matched filter (WMF) acting on the residuals of the linear observation model, 

cf. ina. 

The regularization parameter ^ in LASSO defines the trade-off between the model fit and the estimated 
sparsity order K given by the number of estimated nonzero signal components. When the sparsity order 
Kq is given, choosing a suitable value for the LASSO regularization parameter /r remains a challenging 
task. The homotopy techniques lfT4ll . ITSl . iTThl provide an approach to sweep over a range of p values 
to select the signal estimate with the given Kq. 

The maximum magnitudes of the dual vector can be used for selecting the regularization parameter 
of the generalized LASSO. This is the basis for an order-recursive algorithm to solve the sparse signal 
reconstruction problem IThll . iflTl . ifTSi for the given Kq. In this work, a fast and efficient choice of ^ is 
proposed for direction of arrival estimation from array data. The choice exploits the sidelobe levels of the 
array’s beampattem. We motivate this choice after proving several relations between the regularization 
parameter p, the LASSO residuals, and the LASSO’s dual solution. 

The main achievements of this work are summarized as follows: We extend the convex duality theory 
ifTTIl from the real-valued to the complex-valued case and formulate the corresponding dual problem to 
the complex-valued LASSO. We show that the dual solution is useful for selecting the regularization 
parameter of the LASSO. Three signal processing algorithms are formulated and evaluated to support 
our theoretical results and claims. 

A. Notation 

Matrices A,B,... and vectors a,b,... are complex-valued and denoted by boldface letters. The 
zero vector is 0. The Hermitian transpose, inverse, and Moore-Penrose pseudo inverse are denoted as 
, X~^, X^ respectively. We abbreviate X~^ = {X^)~^. The complex vector space of dimension N 
is written as C^. Af{A) is the null space of A and span(A) denotes the linear hull of A. The projection 
onto span(A) is P^. The Ip-novm is written as || • ||p. For a vector x E C^, ||a;||oo = \xm\, for 

a matrix X E we define ||A^||cxd = max max iX^ml- 

l<n<Afl<m<M 
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II. Problem formulation 

We start from the following problem formulation: Let y G and A G . Find the sparse 

solution G for given sparsity order Kq G N such that the squared data residuals are minimal, 

11 2 

= argmin ||y — A® II 2 subject to \\x llo < Ko , (PO) 

X 

where || • ||p denotes the £p-norm. The problem (IPOl) is known as £o-reconstruction. It is non-convex and 
hard to solve llT9l . Therefore, the ^o'Constraint in (IPOl) is commonly relaxed to an £i constraint which 

renders the problem (IPll) to be convex . Further, a matrix D is introduced in the formulation of the 

constraint which gives flexibility in the problem definition. Let the number of rows of D be arbitrary at 
first. In Sec. HII] suitable restrictions on D are imposed where needed. Several variants are discussed in 
m. This gives 

11 2 

xe^ = argmin||y — Aa ;||2 subject to ||Da;||i < e . (PI) 

X 

In the following, problem (IPll) is referred to as the complex-valued generalized LASSO problem. Incor¬ 
porating the li norm constraint into the objective function results in the equivalent formulation (iPFl) . 

Xi^ = avgmm{\\y - Ax\\l +y\\Dx\\i) . (PP) 

X 

The equivalence of (PO) and (PF) requires suitable conditions to be satisfied such as the restricted isometry 
property (RIP) condition or mutual coherence condition imposed on A, cf. |[20ll . 1211 . Il4l . Under such 
condition, the problems (IPOl) and (IPPI) yield the same sparsity order, Kq = K with K = ||a;^J|o, if the 
regularization parameter y in (IPFI) is suitably chosen. The algorithms of Section |Vl] calculate suitable 
regularization parameters in this sense. 

III. Dual problem to the generalized LASSO 

The generalized LASSO problem ifTTl is written in constraint form, all vectors and matrices are assumed 
to be complex-valued. The following discussion is valid for arbitrary N,M G N: both the over-determined 
and the under-determined cases are included. Following ll22ll . ll2^ . a vector jz G and an equality 
constraint z = Dx are introduced to obtain the equivalent problem 

min (11^ — Aslll-f p||z||i) subject to z = Dx . (1) 

The complex-valued dual vector u = (ui,...,um)^ is introduced and associated with this equality 
constraint. The corresponding Lagrangian is 

£(*, z, u)= \\y — Ax \\2 + mII^IIi + [u^{Dx — z)] (2) 


£i{x,u) -f C 2 {z,u). 


(3) 
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To derive the dual problem, the Lagrangian is minimized over x and z. The terms involving x are 

Ci{x,u) = \\y — Ax\\ 2 +Ke Dx^ . (4) 

The terms in ((H) involving z are 

C 2 {z,u) = iJ.\\z\\i-Re{u^z) . (5) 

The value x minimizing (IH) is found by differentiation, dCi/dx = 0. This gives 

D^u = 2A^ {y — Ax) ( 6 ) 

whereby 

A^Ax = A^y — . (7) 

If D^u G span(A^) the solution to ([7]) becomes, 

x = A+y + ^-l-{A^A)+D^u, ( 8 ) 

'- V -' 2 

xls 

where (•)■*■ denotes the Moore-Penrose pseudoinverse. The Moore-Penrose pseudoinverse is defined 
and unique for all matrices X. In the following, we assume that A has full row-rank and A'*' = 
A^{AA^)~^ is a right-inverse Il24l . Here, ^ G AA(A) is a nullspace term which enables x to deviate 
from the least norm solution A^y. The nullspace M{A) is G C^| = 0}. By identifying ^ 

we specialize ([8]) to the solution of (IPPI) . 

Xi^ = A+y + xf^^^ - \{A^A)+D^u. (9) 

Thus, the solution to the generalized LASSO problem (O consists of three terms, as illustrated in Fig. 
[T] The first two terms are the least norm solution A^y and the nullspace solution ^ which together 
form the unconstrained least squares (LS) solution ®ls- The third term in ® is associated with the 
dual solution. Fig. |2] shows the three terms of @ individually for a simple array-processing scenario. 
The continuous angle 9 is discretized uniformly in [—90,90]° using 361 samples and the wavefield is 
observed by 30 sensors resulting in a complex-valued 30 x 361 A matrix (see section IIV-AI) . At those 
primal coordinates m which correspond to directions of arrival at —3°, 4.5° and 74.5° in Fig. |2j the 
three terms in dO]) sum constructively giving a non-zero Xm (“the mth source position is active”), while 
for all other entries they interfere destructively. Constructive interference is illustrated in Fig. [T] which is 
in constrast to the destructive interference when the three terms in (|9l) sum to zero. This is formulated 
rigorously in Corollary [T] 
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Fig. 1. Sketch of the relations between the primal solution and the terms in (|9](: least norm solution a;ieast norm, least squares 
solution tCLS, and the sparse solutions Xig , The nullspace term is any vector along the line perpendicular to 

span(A+). The red arrow represents the last term in © which is perpendicular to a:™". 


We evaluate dH) at the minimizing solution x and express the result solely hy the dual u. Firstly, we 
expand 

\\y ~ = \\y\\\ + ll^^lli “ 2'R.e{y^Ax} (10) 


Secondly using dh]), 


Dx = {D^u)^x = 2{y — Ax)^ Ax 

= 2y^ Ax — 2||Aa:||| 


Adding Eq. dTOb and the real part of (ITT]) gives 

y\\l - W^^Wl 


( 11 ) 


Ci{x,u) 


y^y- \\y-D^u\\l , 


(12) 
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Fig. 2. Numerical example solution terms in Eq. 19} versus direction of arrival (DOA). 


where we used ([8]) which assumes D^u E spaii(A^) and introduced the ahhreviations 

D = ^DA+, (13) 

y = P^y , with = AA+ (14) 


Due to the fundamental theorem of linear algebra, for an arbitrary vector v E span(A^) can be formulated 
as U^v = 0, where U is a unitary basis of the null space J\f{A). With v = D^u, this becomes 
{DU)^u = 0, resulting in 


inf Ci{x, u) 

X 


TT ^ 119 

y y -\\y - D u\\^ 

—oo, 


Next dS]) is minimized with respect to 2 , see Appendix A, 


if {DU)^u = 0, 
otherwise. 


(15) 


inf C 2 {z,u) 

Z 


0, if ||rr||oo ^ /r 
—oo, otherwise. 


(16) 
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Combining (fTSl) and (fT^ . the dual problem to the generalized LASSO (IPll) is, 


T_T . . 

max y y — \\y — 

D^'uWl 

(17a) 

ugC" 



subject to It 

VI 
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(17b) 

{DU)^u 

= 0. 

(17c) 


Equation @ is solvable for u if the row space constraint (I17cl) is fulfilled. In this case, solving ® 
directly gives 

Theorem 1. If D is non-singular, the dual vector u is the output of a weighted matched filter acting on 
the vector of residuals, i.e. 

u = 2D~^A^{y - Axif) , (18) 

where is the generalized LASSO solution \P1'\ . 

The dual vector u gives an indication of the sensitivity of the primal solution to small changes in 
the constraints of the primal problem (cf. |[22ll : Sec. 5.6). For the real-valued case the solution to (IPLI) 
is more easily constructed and better understood via the dual problem ifTTI . Theorem [J asserts a linear 
one-to-one relation between the corresponding dual and primal solution vectors also in the complex¬ 
valued case. Thus, any results formulated in the primal domain are readily applicable in the dual domain. 
This allows a more fundamental interpretation of sequential Bayesian approaches to density evolution 
for sparse source reconstruction ifTTl . ifTSl : they can be rewritten in a form that shows that they solve 
a generalized complex-valued LASSO problem and its dual. It turns out that the posterior probability 
density is strongly related to the dual solution ITSl . ESI . 

The following corollaries clarify useful element-wise relations between the primal and dual solutions: 
Corollary [T] relates the magnitudes of the corresponding primal and dual coordinates. Further, Corollary |2] 
certifies what conditions on D are sufficient for guaranteeing that the phase angles of the corresponding 
primal and dual coordinates are equal. Finally, Corollary [3] states that both the primal and the dual 
solutions to (IPLI are piecewise linear in the regularization parameter p. 

Corollary 1. For a diagonal matrix D with real-valued positive diagonal entries, we conclude: If the 
mth primal coordinate is active, i.e. X£^^rn 7^ 0 then the box constraint M7M is tight in the mth dual 
coordinate. Formally, 


7 ^ 0 \Um\ — /^) 


(m = 1,..., M). 


(19) 











Fig. 3. Illustration of the LASSO path: Number of active indices versus the regularization parameter /r. Increments in the 
active set occur at 

The proof is given in Appendix B. 

Thus, the mth dual coordinate hits the boundary as the mth primal coordinate becomes active. Con¬ 
versely, when the bound on \um\ is loose (i.e. the constraint on Um is inactive) , the corresponding primal 
variable Xm is zero (the mth primal coordinate is inactive ). The active set Ai is 

M = C {m||tim|=/r} = U. (20) 

Here, we have also defined the dual active set U which is a superset of Ai in general. This is due to 
Corollary [U which states an implication in ([T^ only, but not an equivalence. The active set Ai implicitly 
depends on the choice of /r in problem (iPCl) . Let A4 contain exactly K indices, 

Ai = {mi, m 2 ,.rriK}- (21) 

The number of active indices versus /r is illustrated in Fig. |3] ESI. Starting from a large choice of 
regularization parameter fj, and then decreasing, we observe incremental changes in the active set Ai at 
specific values of the regularization parameter, i.e., the candidate points of the LASSO path ESI . 
The active set remains constant within the interval ^jAp > h > By decreasing //, we enlarge the 

sets Ai and U. By Eq.(| 201). we see that U may serve as a relaxation of the set of active indices Ai. 

Corollary 2. If matrix D is diagonal with real-valued positive diagonal entries, then the phase angles 
of the corresponding entries of the dual and primal solution vectors are equal. 


arg(um) = arg(x£^,m), Vm E Ad 


(22) 

















Corollary 3. The primal and the dual solutions to the complex-valued generalized LASSO problem hPl'\ 
are continuous and piecewise linear in the regularization parameter p > 0. The changes in slope occur 
at those values for p where the set of active indices AA changes. 

The proofs for these corollaries are given in Appendix B. 


A. Relation to the Iq solution 


It is now assumed that M. defines the indices of the K non-zero elements of the corresponding 
solution. In other words: the and Iq solutions share the same sparsity pattern. The Iq solution with 
sparsity order K is then obtained hy regressing the K active columns of A to the data y in the least- 
squares sense. Let 


A^/i — [Omi) ®m2 ) • ■ ■ ) ^rriK ] 


(23) 


where denotes the mth column of A. The Iq solution becomes (cf. Appendix C) 

= ^mV ■ ( 24 ) 

Here, = {A^Am)~^A^ is the left inverse of A_s 4 . By subtracting dO]) from (l24b and restricting 
the equations to the contracted basis Aj^ yields 

Li-M(x£o,M — D^Um 

= Dl^pe^^ . (25) 

In the image of A, the £o-reconstruction problem (IPOl ) and the generalized LASSO (IPLI) coincide 
if the LASSO problem is pre-informed (prior knowledge) by setting Dmm, m G Ad to zero. The 
prior knowledge is obtainable by an iterative re-weighting process Il2^ or by a sequential algorithm on 
stationary sources lITSl . 


IV. Direction of Arrival Estimation 

For the numerical examples, we model a uniform linear array (ULA), which is described with its 
steering vectors representing the incident wave for each array element. 

A. Array Data Model 

Let X = (xi,... ,xm)^ be a vector of complex-valued source amplitudes. We observe time-sampled 
waveforms on an array of N sensors which are stacked in the vector y. The following linear model for 
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the narrowband sensor array data y at frequency uj is assumed, 

y = Ax + n . (26) 


The mth column of the transfer matrix A is the array steering vector am for hypothetical waves from 
direction of arrival (DOA) 9m- To simplify the analysis all columns are normalized such that their £2 
norm is one. The transfer matrix A is constructed by sampling all possible DOAs, but only very are 
active. Therefore, the dimension of A is x M with <C M and x is sparse. The linear model (l26l) 
is underdetermined. 

The nmth element of A is modeled by 


A 


nm 


1 

/a 


exp[j(n 


l)7rsin 9m] ■ 


(27) 


Here 9m = -90° is the DOA of the mth hypothetical DOA to the nth array element. 

The additive noise vector n is assumed spatially uncorrelated and follows a zero-mean complex normal 
distribution with diagonal covariance matrix cr^T. 

Following a sparse signal reconstruction approach ifTTl . lITSl . this leads to minimizing the generalized 
LASSO Lagrangian 


\y — Aa ;||2 + y \\Dx\ 


(28) 


where the weighting matrix D gives flexibility in the formulation of the penalization term in (|28] |. Prior 
knowledge about the source vector leads to various forms of D. This provides a Bayesian framework 
for sequential sparse signal trackers ifTTl . lITSl . ll25l . Specific choices of D encourage both sparsity of 
the source vector and sparsity of their successive differences which is a means to express that the source 
vector is locally constant versus DOA liTTl . The minimization of (|2^ constitutes a convex optimization 
problem. Minimizing the generalized LASSO Lagrangian (|2^ with respect to x for given y, gives a 
sparse source estimate If rank(A) < M, (|28] | is no longer strictly convex and may not have a 
unique solution, cf. ifTTIl . 

Earlier approaches formulated this as a (ordinary) LASSO problem |[2, B, Q which is equivalent to 
(|28] | when specializing to D = I. 


B. Basis coherence 

The following examples feature different levels of basis coherence in order to examine the solution’s 
behavior. As described in ISTl, the basis coherence is a measure of correlation between two steering 
vectors and defined as the inner product between atoms, i.e. the columns of A. The maximum of these 
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inner products is called mutual coherence and is customarily used for performance guarantees of recovery 
algorithms. To state the difference formally: 


coh (aj, aj) 



(29) 


mutual coh(A) = || — I 


OO 


(30) 


The mutual coherence is hounded between 0 and 1 

The following noiseless example in Figs. |4] and [5] demonstrates the dual solution and the WMF output 
for iV = 30 and M = 361. In Fig. |4l the LASSO with /r = 1 is solved for a scenario with three sources 
at DOA —3°, 4.5°, 84.5° and all sources have same power level and are in-phase (see Fig. |4jr), whereas 
in Fig. |5j an additional fourth source at 8° is included. 

1) Low basis coherence: Figure |4] shows the performance when the steering vectors of the active 
sources have small basis coherence. The basis of source 1 is weakly coherent with source 2, coh « 0.02 
using (l29l ). 

Figure |4^ shows the normalized magnitude of the WMF (blue) and the normalized magnitude of the 
dual vector (black). The dual active set U defined in (l20l) is depicted in red color. This figure shows that 
the true source parameters (DOA and power) are well estimated. It is also seen here that the behavior 
of the WMF closely resembles the magnitude of the dual vector and the WMF may be used as an 
approximation of the dual vector. This idea is further explored in Sec. |Vll 

2 ) High basis coherence: Figure [5^ shows that the sources are not separable with the WMF, because 
the steering vectors belonging to source 2 and 3 are coherent, coh = 0.61 using (l29l ). The (generalized) 
LASSO approach is still capable of resolving all 4 sources. The DOA region defined by U is much 
broader around the nearby sources, allowing for spurious peaks close to the true DOA. Figure |5jr shows 
that the true source locations (DOA) are still well estimated, but for the 2“'^ source from left, the power 
is split into two bins, causing a poor source estimate. 


V. Solution path 


The LASSO solution path ifTTl . |[T5l gives the primal and dual solution vector versus the regularization 
parameter /r. The primal and dual trajectories are piece-wise smooth and related according to Corollaries 
mm The following figures show results from individual LASSO runs by varying fx. 

The problems (IPII) is complex-valued and the corresponding solution paths behave differently from what 
is described in Ref. ifTTl . In the following figures, only the magnitudes of the active primal coordinates 


Fig. 4. 


Fig. 5. 
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Dual (a) and primal (b) coordinates for 3 well separated sources with low basis coherence. 
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Dual (a) and primal (b) coordinates for 4 sources with higher basis coherence. 
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b) 



Fig. 6. Magnitudes of the solution paths versus /j, for the simulation parameters in Table |I] and SNR = 40 dB: (a) dual, and 
(b) primal vectors for the case of the complete basis. 


and the corresponding dual coordinates are illustrated. Note that Corollary |2] guarantees that the phases 
of the active primary solution elements and their duals are identical and independent from /r. 

Based on the observed solution paths, we notice that the hitting times (when \um\ = m) of the dual 
coordinates (at lower fi) are well predictable from the solution at higher p. 

For the following simulations and Figs. l6]-fT0l the signal to noise ratio (SNR) is defined as 

SNR= lOlogio (E||Aa;|||/E||n|||) dB. (31) 

SNR = 40 dB in Figs. [hJ-lH whereas SNR = 20 dB in Fig. [TO] 

A. Complete Basis 

First (Fig. discusses the dual and primal solution for a complete basis with M = 6, sparsity order 
iF = 6, and iV = 30 sensors linearly spaced with half wavelength spacing. This simulation scenario is 
not sparse and all steering vectors for 1 < m < M will eventually be used to reconstruct the data 
for small p. The source parameters that are used in the simulation scenario are given in Table H 

We discuss the solution paths in Figs. [6]-fT0l from right {p = oo) to left {p = 0). Initially all dual 
solution paths are horizontal (slope = 0), since the primal solution xp = 0 for /r > 2\\D~^yWoo- In 
this strongly penalized regime, the dual vector is the output of the WMF u = 2D~^ A^y which does 
not depend on p. 
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Fig. 7. Magnitudes of the solution paths versus fj, for the simulation parameters in Table U and SNR = 40 dB: (a) dual, and 
(b, c and d) primal vectors for the case of an 80-vector overcomplete basis . For the primal coordinates the peak within ±2 bins 
from the true bin is tracked based on (b) maximum (c) energy. The magnitudes of the corresponding elements of xtg are shown 
in (d). 
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Fig. 8. Dual and primal coordinates at selected values of ^ for 81-vector overcomplete basis for SNR = 40 dB. 


At the point = 2\\D~^y\\oo the first dual coordinate hits the boundary (I17hl) . This occurs at 
y} = 21 in Fig. and the corresponding primal coordinate becomes active. As long the active set 
M. does not change, the magnitude of the corresponding dual coordinate is y, due to Corollary [T] The 
remaining dual coordinates change slope relative to the basis coherence level of the active set. 

As y decreases, the source magnitudes at the primal active indices increase since the ^i-constraint in 
(Ip FI) becomes less important, see Fig. [^. The second source will become active when the next dual 
coordinate hits the boundary (at y^ = 17 in Fig. |^. 
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Fig. 9. For 10 noise realizations, magnitudes of the solution paths versus /r for the simulation parameters in Table U and 
SNR = 40 dB: (a) dual, and (b, c and d) primal vectors for the case of an 80-vector overcomplete basis . For the primal 
coordinates the peak within ±2 bins from the true bin is tracked based on (b) maximum (c) energy. The magnitudes of the 
corresponding elements of xi^ are shown in (d). 












































Fig. 10. As Fig. m but with SNR = 20 dB: 


When the active set is constant, the primary and dual solution is piecewise linear with [i, as proved 
in Corollary [3] The changes in slope are quite gentle, as shown for the example in Fig. 0 . Finally, at 
/i = 0 the problem (iPFb degenerates to an unconstrained (underdetermined) least squares problem. Its 
primal solution x = a;Ls, see ([8]), is not unique and the dual vector is trivial, u = 0. 
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No. 

DOA (°) 

Power (lin.) 

1 

-6.0 

4.0 

2 

-1.0 

7.0 

3 

4.0 

9.0 

4 

9.0 

7.0 

5 

14.0 

12.0 

6 

19.0 

5.0 


TABLE I 

Source parameters for simulation scenario 


B. Overcomplete Basis 

We now enlarge the basis to M = 81 with hypothetical source locations 6m £ [—20°, 20°] with 0.5° 
spacing, and all other parameters as before. The solution is now sparse. 

The LASSO path ITSl is illustrated in Fig. |7] where we expect the source location estimate within ±2 
bins from the true source location. The dual Fig. |7^ appears to be quite similar to Fig. [^. 

Corollary [3] gives that the primary solution should change linearly, as demonstrated for the complete 
basis in Fig. j^. Here we explain why this is not the case for the overcomplete basis primary solution in 
Fig. jTJr . This is understood by examining the full solution at selected values of p. (asterisk (*) in Fig. |7]l. 
At /r = 20 just one solution is active, only the black source (source 5) is active though one bin to the 
left, as shown in Fig. [8^2. The dual vector in Fig. [8^1-fig:path5dualprimaldl, has a broad maximum, 
explaining the sensitivity to offsets around the true DOA. The shape of this maximum is imposed by the 
dictionary; the more coherent the dictionary, the broader the maximum. Between /r = 16 and p = 11, 
the black source appears constant, this is because at large values the source is initially located in a 
neighboring bin. As p decreases, the correct bin receives more power, see Fig. [8j52 and Fig. [8]:2 for 
p = 15 and p = 10, respectively. When it is stronger than the neighboring bin at /r < 11, see Fig. [8}12, 
this source power starts increasing again. This trading in source power causes the fluctuations in Fig. |7Jr. 

One way to correct for this fluctuation is to sum the coherent energy for all bins near a source, i.e., 
multiplying the source vector with the corresponding neighbor columns of A, which also touch the 
boundary (marked region in Fig. [8]) and then compute the energy based on the average received power 
at each sensor. This gives a steady rise in power as shown in Fig. [7]:. 

We motivate solving (iPFl) as a substitute for ^Q-reconstruction (IPOb — finding the active indexes of the 





19 


ii solution, see Fig. |7]1. The Iq primal can be found with the restricted basis and the value of the £i 
primal from dUl, which depends on n, or by just solving (l24b . 

To investigate the sensitivity to noise, 10 LASSO paths are simulated for 10 noise realizations for both 
SNR = 40 dB (Fig. |9l) and SNR = 20 dB (Fig. [TOll . The dual (Fig. and Fig. [TOkl. appears quite stable 
to noise, but the primal |x£i| (Figs.|9jr andfTOb) show quite large variation with noise. This is because the 
noise causes the active indexes to shift and thus the magnitude to vary. The mapping to energy |xenergy| 
(Figs. |9j: and [TObl or the |x£o| solution (Figs. |9]i and [TOhl makes the solution much more stable. 

VI. Solution Algorithms 

Motivated by Theorem [T] and Corollary [U we propose the order-recursive algorithm in Table HIl for 
approximately solving problem (IPOl) by selecting a suitable regularization parameter ^ in problem (iPFl) . 
a faster iterative algorithm in Table |III1 and a dual-based iterative algorithm in Table |IVl 

As shown by Theorem [T] the dual vector is evaluated by a WMF acting on the LASSO residuals. 
The components of the dual vector which hit the boundary, i.e. \um\ = correspond to the active primal 
coordinates \xm\ > 0. As \ura\ = constitutes a necessary condition, this condition is at least |A4| times 
fulfilled. Informally, we express this as: “The dual vector must have |A4| peaks of height where the 
shaping is defined by the dictionary A and the weighting matrix D” 

The key observation is the reverse relation. By knowing the peak magnitudes of the dual vector, 
one estimates the appropriate /r-value to make i peaks hit the boundary. We denote this regularization 
parameter value as /i*. This is a necessary condition to obtain i active sources. 

We define the peak(u, z)-function which returns the largest local peak in magnitude of the vector 
u. A local peak is defined as an element which is larger than its adjacent elements. The peak function can 
degenerate to a simple sorting function giving the ith largest value, this will cause slower convergence 
in the algorithms below. 

Proposition 1. Assuming all sources to be separated such that there is at least a single bin in between, 
the peak function relates the regularization parameter to the dual vector via 

= peak (|ii(/r*)|,i) = peak . (32) 

Equation (l32l ) is a fixed-point equation for /r® which is demanding to solve. Therefore we approximate 
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rfl 


with previously obtained dual vector^j. At a potential new source position re, the dual vector is 
expanded as 
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UnifJ' ) — d-n ( y 'y ^ 

^ m&Mi 
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m£Mi-2 
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(33) 

(34) 

(35) 


D* 

x-' in 


-^ny 


(36) 


The approximations used in (I34l)-(l36l) are progressive. These approximations are good if the steering 
vectors corresponding to the active set are sufficiently incoherent: \a^am\ ~ 0 for re,m E Ad. Eq. (|3^ 
corresponds to the conventional beamformer A^y for a single snapshot. In the solution algorithms, the 
approximations (I34b-(l36l) are used for the selection of the regularization parameter /i only, thus the peaks 
in the conventional beamformer do not correspond to the X£^ solution. 

Our simulations have shown that a significant speed-up achievable, so we named it fast-iterative 
algorithm, cf Section IVI-BI 


From the box constraint (I17bll . the magnitude of the peak in u does not change much during the 
iteration over v. It is bounded by the difference in regularization parameter. For any /r* < we 

conclude from Corollary [T] and Proposition [T] 

peak(re(/r*“^), f) — peak(re(|U*), z) < — |U*. (37) 

'-V-' '-V-' 

Thus, the magnitude of the peak cannot change more than the corresponding change in the regular¬ 
ization parameter. The left hand side of (l37]) is interpretable as the prediction error of the regularization 
parameter and this shows that the prediction error is bounded. 

Assuming our candidate point estimates (/z*^, /z*^, ...) are correct, we follow a path of regularization 
parameters /z^, /z^, ... where /z^ is slightly higher than the lower end of the regularization interval. 

Specifically, /z^ = (1 — F)^*p + F with F < 1. For the numerical examples F = 0.9 is used. This 
F is chosen because the primal solution X£^ is closest to X£^ at the lower end of the interval. 

*For the first step, we define the WMF output as uq = 2D~^y. 
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In the following we focus on the order recursive algorithm, and indicate the differences to the other 
approaches. 

A. Recursive-In-Order algorithm 

The recursive-in-order algorithm in Table HIl finds one source at a time as /r is lowered. To this purpose 
it employs an approximation of the height of the ith local peak given a solution with (i — 1) peaks. The 
underlying assumption is that the next source will become active at the location corresponding to the 
dual coordinate of the next peak. Equation (l34l ) allows to approximate 

/r* = peak(u(//*), i) « peak(w(/i*“^), z) . (38) 

This assumption is not universally valid as it may happen that the coordinate corresponding to the 
(id-1)*^ peak becomes active first, although peak(u*“^, i) > peak(it*“^, id-1). In this case, two sources 
become active as the regularization parameter is chosen too low. This exception can be handled by, e.g., 
bisection in /i. 


Given: A € , D € diagR*^, y G 

Given: ffo G N , T G]0, 1[, xi^. 


1. — fraj Si — e||iC£j^||oo 

2 : = 2D~^A^ (y - Axi^) 

3: if \M\ < Ko 

4: W = {m| 1 - 

5: i = |W| -d 1 

6: /i = (1 — T) peak(ii,*“^, i) d- F peak(u*“^, i d- l) 

7: else if |A1| > Kq 

8: bisecting between and yl as defined in Eq. 02} 

9: end 

10: Output: y 


TABLE II 

Order-recursive algorithm to select y for given sparsity order Ko- 
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The recursive-in-order algorithm provided in Table HIl takes as input the dictionary A, the generalization 
matrix D, the measurement vector y, the given sparsity order Kq and the previous order LASSO solution 
. In line 1 the actual active set is determined hy thresholding and line 2 produces the dual vector hy 
Theorem [T] Line 2 can he omitted, if the LASSO solver makes the dual solution available, e.g., through 
primal-dual interior point methods or alternating direction method of multipliers. If the size of the active 
set of the previous LASSO solution is less than the given sparsity order Kq, the algorithm determines the 
dual active set U in line 4, cf. Eq. (l20b . The incremented cardinality of U is the new requested number of 
hitting peaks in the dual vector. Finally, line 6 calculates q based on the candidate point estimate (l38l) . 

B. Fast-Iterative Algorithm 

The approximation from Equation (1381 ) is not limited to a single iteration. Therefore, (l38l) can be 
extended further to 

q* pa peak('u(^*“^), i) 

?a peak('u(q*“^), i) 


pa peak('u(q°), i) = peak(2D ^A^y,i). (39) 

This observation motivates the iterative algorithm in Table |III] The main difference to the recursive-in- 
order algorithm is found in line 6. The peakfinder estimates the maximum of the peak. This leads 
to a significant speed-up, if sources are well separated and their basis coherence is low. 

C. Detection in the dual domain 

As a demonstrative example, we provide the fast iterative algorithm formulated solely in the dual 
domain in Table HVl Note that the gird-free atomic norm solutions |[30l . |[32l . |33l, |[37l . Il38l follow a 
similar approach. 

As asserted by (l20l ). searching for active indices in the dual domain is effectively a form of relaxation 
of the primal problem (IPEI) . This amounts to peak finding in fhe oufpuf of a WMF acfing on fhe residuals, 
cf. Theorem [T] In line 1, fhe acfive sef Ai is effectively approximafed by fhe relaxed sef U. Therefore, 
fhe £o solution is determined by regression on the relaxed set in line 2 and the primal active set is found 
by thresholding this solution in line 3. The remainder of the algorithm is equal the primal based ones. 
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Given: A € , D G diagR*^, y G 

Given: No G N , F G]0, 1[, xi^ . 

1: 

jVl — 5i — £||iC£^||oo 


2: 

= 2D-^A^ {y - Axe^) 


3: 

if \M\ < Ko 


4: 



5: 

i=\U\ + l 


6: 

y = {1 — F) peak(n'“'^, N) + F peak(n®“ 

\Ko + l) 

7: 

else if \M\ > Kq 


8: 

bisecting between and y* as defined in 

Eq. (m 

9: 

end 


10: 

Output: y, 



TABLE III 

Iterative primal based algorithm to select ^ for given sparsity order Kq . 


VII. Simulation 

In this section, the performance of the proposed dual estimation algorithms is evaluated based on 
numerical simulation. We use synthetic data from a uniform linear array with N = elements with half¬ 
wavelength spacing. The DOA domain is discretized hy 6m = (m — 1)^^^ — 90° with m = 1,..., M and 
M = 180. The simulation scenario has Kq = 8 far-held plane-waves sources (l26l) . The uncorrelated noise 
n is zero-mean complex-valued circularly symmetric normally distributed ~ AA(0, I), i.e. OdB power. 
Eight sources are stationary at = [—45, —30, —14, 9, 17, 30, 44, 72] degrees relative to broadside 
with constant power level (PL) [—5, 10, 5, 0, 11, 12, 9, 25] dB lITSl . 

The dual solution for the order-recursive approach, Table ini corresponds to the results shown in Fig. 
[TT] The faster iterative approach. Table jlllj yields the results in Fig. [T2| The dual solution using the 
primal solution from the previous iteration is interpreted as a WMF and used for the selection of p (left 
column). Next, the convex optimization is carried out for that value of p giving the dual solution. We 
plot the dual solution on a linear scale and normalized to a maximum value of 1 which is customary in 
implementations of the dual for compressed sensing OOl . lf32l . 1(3^ . The number of active sources (see 
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Given: A € D G diagR*^, y G 

Given: Tfo G N , F g]0, 1[, u. 


1: 14 = {m| 1 - -1^ < e^} 

2: Xi„ = A+y 

3: M = {m| |a;«o,m| > 5}, 5 = e||a:io||oo 

4: if|2V(|<A'o 

5: i = |W| + 1 

6: y = (1 — F) peak(n““^, F) + Fpeak(n*“^, Fq + l) 

7: else if |A^| > Kq 

8: bisecting between and p* as defined in Eq. ( 132b 

9: end 

10: Output: p 


TABLE IV 

Iterative dual based algorithm to select y, for given sparsity order Fq. 


right column in Figs. [TT] and \Y2\i are determined according to line 1 in Tables HIl and |III1 

For the order-recursive approach step 1 , Fig. [TTk . the /r is selected based on the main peak 9 = 72° 
and a large side lobe at 0 = 80°. Once the solution for that /x is obtained it turns out that there is no an 
active source in the sidelobe.The solution progresses steadily down the LASSO path. Figure [12] shows the 
faster iterative approach in Table |nl| for the 8-source problem. In the first iteration we use a p between 
the 8th and 9th peak based on the WMF solution (Fig. fT^I. There are many sidelobes associated with 
the source at 9 = 72°. As soon as the dominant source is determined, the sidelobes in the residuals 
are reduced and only 5 sources are observed. After two more iterations, all 8 sources are found at their 
correct locations. 

For both algorithms, the main CPU time is used in solving the convex optimization problem. Thus 
the iterative algorithm is a factor 8/3 faster in this case than the straightforward approach which strictly 
follows the LASSO path. The approach described in Table ini has approximately the same CPU time 
usage as the approach in Ref. lfT8]l . but it is conceptually simpler and provides deeper physical insight 
into the problem. 












Fig. 11. Dual coordinates for order-recursive approach corresponding to step p = 1 (a and b), p = 2 (c and d), and p = 8 
(e and f). Left column: Dual (dB) for the previous step which is used for selecting p (horizontal line). Right column: Dual 
(lin) normalized with p (maximum is 1), the true source locations are marked with o, and the actual value of p and number of 
sources found is also indicated. 


VIII. Conclusion 

The complex-valued generalized LASSO problem is convex. The corresponding dual problem is 
interpretable as a weighted matched Filter (WMF) acting on the residuals of the LASSO. There is a 
linear one-to-one relation between the dual and primal vectors. Any results formulated for the primal 
problem are readily extendable to the dual problem. Thus, the sensitivity of the primal solution to small 
changes in the constraints can be easily assessed. Further, the difference between the solutions and 
the is characterized via the dual vector. 


















































































Fig. 12. Dual coordinates iterative approach corresponding for localizing Kq = 8 sources for step i = 1 (a and b), i = 2 (c 
and d), and i = 3 (e and f). Left column: Dual (dB) for the previous step which is used for selecting /i (horizontal line). Right 
column: Dual (lin) normalized with /i (maximum is 1), the true source locations are marked with o, and the actual value of ^ 
and number of sources found is also indicated.. 


Based on mathematical and physical insight, an order-recursive and a faster iterative LASSO-hased 
algorithm are proposed and evaluated. These algorithms use the dual variable of the generalized LASSO 
for regularization parameter selection. This greatly facilitates computation of the LASSO-path as we can 
predict the changes in the active indexes as the regularization parameter is reduced. Further, a dual-hased 
algorithm is formulated which solves only the dual problem. The examples demonstrate the algorithms, 
confirming that the dual and primal coordinates are piecewise linear in the regularization parameter /r. 
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Appendix A 

Proof of (O: Set u = {ui,..., umY' G C^. From (IS), 

M 

^l\\z\\l - Re(it^z) (fikml - Re(rr^Zm)) (Al) 

m=l 

M 

— ^ ^ (M I I COS (j^rnm ) | \; (-^2) 

1 ^ 
m=l 

=^m 

where we set u’^Zm = |iim| The phase difference 4>mm depends on both Um and Zm- If all 

coefficients jlm in (IA21) are non-negative, jlm > 0, for all Zm G C, then 

min (/r||z||i — Re(ri'^z)) = 0, (A3) 

otherwise there is no lower hound on the minimum. Therefore, all |ttm| must he hounded, i.e. \um\ < 
/rVm = to ensure that all Jim > 0 for all possible phase differences — 1 < cos (Jrnm < 1- 

Finally, we note that ||w||oo = niaxm \ um\- 


Appendix B: Proofs of Corollaries [HIH and [3] 

Proof for Corollary [7] 

Let the objective function of the complex-valued generalized LASSO problem (iPFl) be 

^ = \\y — Ax \\2 + n\\Dx\\i . (Bl) 

In the following, we evaluate the subderivative Il35l as the set of all complex subgradients as 
introduced in lf36l . First, we observe 

= —2A^{y — Ax) + yd\\Dx\\i . (B2) 


Next, it is assumed that D is a diagonal matrix with positive real-valued diagonal entries. Then the 
subderivate c)||D®||i evaluates to 


5||Z)a;||i = < 



for XmT^O 


(B3) 


{z G C, \z\ < 1} for Xjn = 0. 

The minimality condition for is equivalent to setting (IB21) to zero. For all m with Xm 0 and with 
m, this gives 


DmmUm — 




7,^71 


(B4) 


It readily follows that lu^l = F for / 0 and Dmm / 0- 
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Proof for Corollary^ 

Starting from Eq. (IB41) . dividing by /r and invoking Corollary 1, we conclude for matrices D with 
positive diagonal entries and for m G Ad, 

arg(x„,) ^ {y - Ax) = Um , (B5) 

^mm 

where is the mth standard basis vector. This concludes the proof of Corollary |2l 
Proof for Corollary [3] 

For the primal vector, this was shown in the real-valued case by Tibshirani IT] and for the complex¬ 
valued case, this is a direct consequence of Appendix B in lITSl . For the dual vector, this was shown 
in the real-valued case by Tibshirani ifTTI and for the complex-valued case, this readily follows from 
Theorem 1: If the primal vector Xf^^ depends linearly on y, in (fT8l ) then so does the dual vector u. 

Appendix C: 4 solution 
The gradient (cf. Appendix B) of the data objective function is 

V \\y - Ax\\l = -2A^ {y - Ax) (Cl) 

For the active source components, Xm with m ^ M., the £o -constraint of (IPOl) is without effect and the 
solution results from setting the gradient to zero, i.e. solving the normal equations. 

A mV = => = A mV (^ 2 ) 

We set 

^io,A4 A ^ • (C3) 

This is inserted into (1C 11) . 

V \\y - Axi^^mWI = -2A^ {y - A{xi^^m - ^)) ■ (C4) 

Using gives 


D^um 

= (y - 

(C5) 


= {y - Am (x£o,m - ^)) 

(C6) 


= 2A^A_a4A 

(C7) 
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This results in 

A = I (C8) 

which depends on \i both explicitly and implicitly through Ad. If the set of nonzero elements of (IPOl ) is 
equal to the active set of (IPTI) . the solutions of (IPOb and (iPPl) differ hy (IC81) . 
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